Fine-grained Full-text Search

Author

  • Yasusi Kanada
Abstract

Most conventional text retrieval methods are designed to search for documents. However, users often do not need the documents themselves; they want to quickly find specific information that may come from a large collection of texts. To satisfy this need, we have developed a model and two methods for fine-grained searching. The unit of search in this model is called an atom, and it can be a sentence or a smaller syntactic unit. A score, i.e., a relevance value, is defined for each atom and each query, and the score is propagated between atoms. The two methods display excerpts from the text surrounding each search-result item, hyperlinks to the document parts that contain the items, or both. Multiple topics in a document can be listed separately in a search result. Evaluation of two prototypes, built on a conventional full-text search engine used as is or with only a small modification, has demonstrated that these methods are feasible and can reduce the search cost for users in terms of time and effort.

1998-12 (Partially updated on 2009-7)
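The atom model in the abstract can be illustrated with a minimal sketch. The abstract does not give the scoring or propagation formulas, so everything below is an assumption made for illustration: atoms are sentences, the base score counts distinct query terms in an atom, and each atom passes a decayed share of its score to its neighbors. The function names (`split_atoms`, `base_scores`, `propagate`, `top_excerpts`) and the `decay`, `radius`, and `context` parameters are hypothetical, not the paper's.

```python
import re


def split_atoms(text):
    """Split text into sentence-level atoms (a deliberately naive splitter)."""
    return [s.strip() for s in re.split(r"(?<=[.!?])\s+", text) if s.strip()]


def base_scores(atoms, query):
    """Assumed base score: number of distinct query terms found in each atom."""
    terms = set(query.lower().split())
    return [
        float(len(terms & set(re.findall(r"\w+", atom.lower()))))
        for atom in atoms
    ]


def propagate(scores, decay=0.5, radius=1):
    """Assumed propagation rule: each atom adds score * decay**distance
    to every atom within `radius` positions of it."""
    out = list(scores)
    for i, score in enumerate(scores):
        for dist in range(1, radius + 1):
            for j in (i - dist, i + dist):
                if 0 <= j < len(scores):
                    out[j] += score * decay ** dist
    return out


def top_excerpts(atoms, scores, k=2, context=1):
    """Return (atom index, excerpt) for the k best atoms with a nonzero score.
    The excerpt is the atom plus `context` atoms on each side."""
    ranked = sorted(range(len(atoms)), key=lambda i: scores[i], reverse=True)[:k]
    return [
        (i, " ".join(atoms[max(0, i - context): i + context + 1]))
        for i in ranked
        if scores[i] > 0
    ]
```

Calling `split_atoms` on a text, then `base_scores` with a query, then `propagate`, and finally `top_excerpts` yields ranked excerpts rather than whole documents. Under these assumptions, an atom that contains no query term can still receive a score from a matching neighbor, which is one way surrounding context could end up in the displayed excerpt.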

Similar resources

Integrating Scene Text and Visual Appearance for Fine-Grained Image Classification with Convolutional Neural Networks

Text in natural images contains rich semantics that are often highly relevant to objects or scene. In this paper, we focus on the problem of fully exploiting scene text for visual understanding. The main idea is combining word representations and deep visual features into a globally trainable deep convolutional neural network. First, the recognized words are obtained by a scene text reading sys...

Style Finder: Fine-Grained Clothing Style Recognition and Retrieval

With the rapid proliferation of smartphones and tablet computers, search has moved beyond text to other modalities like images and voice. For many applications like fashion, visual search offers a compelling interface that can capture stylistic visual elements beyond color and pattern that cannot be as easily described using text. However, extracting and matching such attributes remains an extr...

A Sciento-Text Framework for Fine-Grained Characterization of the Leading World Institutions in Computer Science Research

Introduction. This paper describes our experimental framework for a text-analysis-based, fine-grained characterization of leading world institutions in Computer Science (CS) research. Though the present paper uses CS research output data from Web of Science, it can be extended and applied to any discipline and data source. The existing well-known ranking systems, such as ARWU, Times Higher Educatio...

User-Defined Semantic Enrichment of Full-Text Documents: Experiences and Lessons Learned

Abstract. Semantic annotation of digital documents is typically done at metadata level. However, for fine-grained access semantic enrichment of text elements or passages is needed. Automatic annotation is not of sufficient quality to enable focused search and retrieval: either too many or too few terms are semantically annotated. User-defined semantic enrichment allows for a more targeted appro...

Augmenting Presentation MathML for Search

The ubiquity of text search is both a boon and a bane for the quest for math search. A bane in that users' expectations are high regarding accuracy, in-context highlighting and similar features. Yet also a boon with the availability of highly evolved search engine libraries; Youssef has previously shown how an appropriate ‘textualization’ of mathematics into an indexable form allows standard text...

Publication date: 2009